Bivariate zero-inflated regression for count data: a Bayesian approach with application to plant counts.
Abstract
Bivariate zero-inflated (BZI) regression models have recently been used in many applications in the medical sciences to model excess zeros. Examples include the BZI Poisson (BZIP) and BZI negative binomial (BZINB) models. Such formulations vary in their basic modeling aspects and use the EM algorithm (Dempster, Laird and Rubin, 1977) for parameter estimation. A different modeling formulation in the Bayesian context is given by Dagne (2004). We extend the modeling to a more general setting for multivariate ZIP models for count data with excess zeros as proposed by Li, Lu, Park, Kim, Brinkley and Peterson (1999), focusing on a particular bivariate regression formulation. For the basic formulation in the case of bivariate data, we assume that Xi are (latent) independent Poisson random variables with parameters λi, i = 0, 1, 2. A bivariate count response vector (Y1, Y2) follows a mixture of four distributions: p0 is the mixing probability of a point mass at (0, 0); p1 is the mixing probability that Y2 = 0 while Y1 = X0 + X1; p2 is the mixing probability that Y1 = 0 while Y2 = X0 + X2; and (1 - p0 - p1 - p2) is the mixing probability that Yi = Xi + X0, i = 1, 2. This choice of the parameters {pi, λi, i = 0, 1, 2} ensures that the marginal distributions of Yi are zero-inflated Poisson (λ0 + λi). All the parameters thus introduced are allowed to depend on covariates through canonical-link generalized linear models (McCullagh and Nelder, 1989). This flexibility allows for a range of real-life applications, especially in the medical and biological fields, where the counts are bivariate in nature (with strong association between the processes) and where there is an excess of zeros in one or both processes. Our contribution in this paper is to employ a fully Bayesian approach consolidating the work of Dagne (2004) and Li et al. (1999), generalizing the modeling and sampling-based methods described by Ghosh, Mukhopadhyay and Lu (2006), to estimate the parameters and obtain posterior credible intervals both when covariates are not available and when they are. In this context, we provide explicit data augmentation techniques that lend themselves to easier implementation of the Gibbs sampler by giving rise to well-known, closed-form posterior distributions in the bivariate ZIP case. We then use simulations to explore the effectiveness of estimation with the Bayesian BZIP procedure, comparing its performance with that of the Bayesian and classical ZIP approaches. Finally, we demonstrate the methodology on bivariate plant count data with excess zeros that were collected on plots in the Phoenix metropolitan area, and compare the results with independent ZIP regression models fitted to both processes.
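The four-component mixture described in the abstract can be illustrated with a short simulation. The sketch below is not the authors' code: the function name `simulate_bzip`, its arguments, and the use of NumPy are assumptions made for illustration, and it covers only the case without covariates. It draws the latent Poisson variables X0, X1, X2, assigns each observation to one of the four mixture components, and builds (Y1, Y2) accordingly.

```python
import numpy as np


def simulate_bzip(n, p, lam, seed=0):
    """Draw n pairs (Y1, Y2) from the four-component BZIP mixture.

    p   = (p0, p1, p2): mixing probabilities; the remainder
          1 - p0 - p1 - p2 is the weight of the component in which
          both counts are non-degenerate.
    lam = (lam0, lam1, lam2): rates of the latent independent
          Poisson variables X0, X1, X2.
    (Hypothetical helper; names are illustrative only.)
    """
    p0, p1, p2 = p
    p3 = 1.0 - p0 - p1 - p2
    if min(p0, p1, p2, p3) < 0:
        raise ValueError("mixing probabilities must be non-negative "
                         "and sum to at most 1")
    rng = np.random.default_rng(seed)
    # Columns hold the latent variables X0, X1, X2.
    x = rng.poisson(lam, size=(n, 3))
    # Component label: 0 -> (0, 0); 1 -> Y2 = 0; 2 -> Y1 = 0; 3 -> neither forced to 0.
    comp = rng.choice(4, size=n, p=[p0, p1, p2, p3])
    y1 = np.where((comp == 1) | (comp == 3), x[:, 0] + x[:, 1], 0)
    y2 = np.where((comp == 2) | (comp == 3), x[:, 0] + x[:, 2], 0)
    return y1, y2
```

Under this construction Y1 is a structural zero with probability p0 + p2 and is otherwise Poisson(λ0 + λ1), so the empirical zero fraction of `y1` should be close to (p0 + p2) + (p1 + p3)·exp(-(λ0 + λ1)) with p3 = 1 - p0 - p1 - p2, which is the zero-inflated Poisson marginal stated in the abstract. The shared latent variable X0 is what induces the association between the two counts.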
Journal title:
- The International Journal of Biostatistics
Volume 6, Issue 1
Publication date: 2010